High-dimensional Bayesian Tobit regression for censored response with Horseshoe prior

Mai, The Tien

arXiv.org Machine Learning

Censored response variables--where outcomes are only partially observed due to known bounds--arise in numerous scientific domains and present serious challenges for regression analysis. The Tobit model, a classical solution for handling left-censoring, has been widely used in economics and beyond. However, with the increasing prevalence of high-dimensional data, where the number of covariates exceeds the sample size, traditional Tobit methods become inadequate. While frequentist approaches for high-dimensional Tobit regression have recently been developed, notably through Lasso-based estimators, the Bayesian literature remains sparse and lacks theoretical guarantees. In this work, we propose a novel Bayesian framework for high-dimensional Tobit regression that addresses both censoring and sparsity. Our method leverages the Horseshoe prior to induce shrinkage and employs a data augmentation strategy to facilitate efficient posterior computation via Gibbs sampling. We establish posterior consistency and derive concentration rates under sparsity, providing the first theoretical results for Bayesian Tobit models in high dimensions. Numerical experiments show that our approach compares favorably with the recent Lasso-Tobit method. Our method is implemented in the R package tobitbayes, which can be found on GitHub.
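
The data augmentation step behind Tobit Gibbs samplers is simple to sketch: censored responses are imputed from a truncated normal given the current parameters, after which the model can be updated as an ordinary linear regression. The following is a minimal illustration of that idea, not the tobitbayes implementation; the function name and the rejection sampler are hypothetical, and real samplers use more efficient truncated-normal draws.

```python
import numpy as np

def gibbs_impute_censored(y, X, beta, sigma, c=0.0, rng=None):
    """One data-augmentation step for left-censored Tobit regression:
    each censored response (y <= c) is replaced by a draw from
    N(x_i' beta, sigma^2) truncated above at the censoring point c.
    Rejection sampling is used here for clarity; it is inefficient in
    the far tail but fine for a sketch."""
    rng = np.random.default_rng(0) if rng is None else rng
    mu = X @ beta
    y_star = y.astype(float).copy()
    for i in np.where(y <= c)[0]:
        draw = rng.normal(mu[i], sigma)
        while draw > c:                 # reject until below the bound
            draw = rng.normal(mu[i], sigma)
        y_star[i] = draw
    return y_star
```

Given the augmented responses, the conditional updates for the coefficients and variance reduce to the standard conjugate steps of Bayesian linear regression under the Horseshoe prior.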


GFSNetwork: Differentiable Feature Selection via Gumbel-Sigmoid Relaxation

Wydmański, Witold, Śmieja, Marek

arXiv.org Artificial Intelligence

Feature selection in deep learning remains a critical challenge, particularly for high-dimensional tabular data where interpretability and computational efficiency are paramount. We present GFSNetwork, a novel neural architecture that performs differentiable feature selection through temperature-controlled Gumbel-Sigmoid sampling. Unlike traditional methods, where the user has to define the requested number of features, GFSNetwork selects it automatically during an end-to-end process. Moreover, GFSNetwork maintains constant computational overhead regardless of the number of input features. We evaluate GFSNetwork on a series of classification and regression benchmarks, where it consistently outperforms recent methods including DeepLasso, attention maps, as well as traditional feature selectors, while using significantly fewer features. Furthermore, we validate our approach on real-world metagenomic datasets, demonstrating its effectiveness in high-dimensional biological data. In summary, our method provides a scalable solution that bridges the gap between neural network flexibility and traditional feature selection interpretability. We share our Python implementation of GFSNetwork at https://github.com/wwydmanski/GFSNetwork, as well as a PyPI package (gfs_network).
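
The Gumbel-Sigmoid (binary-Concrete) relaxation at the heart of such gates can be sketched in a few lines: add logistic noise to per-feature logits and squash with a temperature-scaled sigmoid, so the gates are nearly binary at low temperature yet remain differentiable. This is a numpy sketch of the general relaxation under assumed defaults, not the GFSNetwork code (which is a neural architecture trained end-to-end).

```python
import numpy as np

def gumbel_sigmoid(logits, tau=0.5, rng=None):
    """Relaxed ~binary gates: sigmoid((logits + logistic noise) / tau).
    Lower tau pushes the gates toward hard 0/1 decisions; higher tau
    gives softer, easier-to-optimize gates."""
    rng = np.random.default_rng(0) if rng is None else rng
    u = rng.uniform(1e-6, 1 - 1e-6, size=np.shape(logits))
    noise = np.log(u) - np.log1p(-u)      # sample from Logistic(0, 1)
    return 1.0 / (1.0 + np.exp(-(logits + noise) / tau))
```

In a feature-selection network the logits are learned parameters, and the gate values multiply the input features, so gradients flow through the relaxation to the selection decision itself.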


PALMS: Parallel Adaptive Lasso with Multi-directional Signals for Latent Networks Reconstruction

Xing, Zhaoyu, Zhong, Wei

arXiv.org Machine Learning

Networks exist widely in our world, characterizing the interactions between different items in many fields, such as social networks between people and trading networks between companies. With the deepening of research into various complex dynamic systems, network data and network-based dynamic processes have increasingly become focal points of academic inquiry. From the perspective of empirical analysis, network structures profoundly influence how the world changes and evolves (Dhar et al., 2014), as with transportation networks between cities, supply chain networks for international trade, and competition relationships in an evolutionary ultimatum game. Many scholars regard a known network structure as a treatment, focusing on whether existing network connections exert an influence on other variables, which is commonly referred to as network effects identification. Examples include the impact of transportation networks on economic development (Bramoullé et al., 2009) and the influence of social networks on U.S. election outcomes (Herzog, 2021; Kleinnijenhuis and De Nooy, 2013).


Forecasting Particle Accelerator Interruptions Using Logistic LASSO Regression

Li, Sichen, Snuverink, Jochem, Perez-Cruz, Fernando, Adelmann, Andreas

arXiv.org Artificial Intelligence

Unforeseen particle accelerator interruptions, also known as interlocks, lead to abrupt operational changes despite being necessary safety measures. These may result in substantial loss of beam time and perhaps even equipment damage. We propose a simple yet powerful binary classification model aiming to forecast such interruptions, in the case of the High Intensity Proton Accelerator complex at the Paul Scherrer Institut. The model is formulated as logistic regression penalized by the least absolute shrinkage and selection operator, based on a statistical two-sample test to distinguish between unstable and stable states of the accelerator. The primary objective of issuing alarms prior to interlocks is to allow for countermeasures and reduce beam time loss. Hence, a continuous evaluation metric is developed to measure the saved beam time in any period, given the assumption that interlocks could be circumvented by reducing the beam current. The best-performing interlock-to-stable classifier can potentially increase the beam time by around 5 min in a day. Possible instrumentation for fast adjustment of the beam current is also listed and discussed.
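
An L1-penalized logistic classifier of the kind described here can be fit with proximal gradient descent: a gradient step on the logistic loss followed by soft-thresholding, which zeroes out uninformative features. This is a generic numpy sketch of that penalized classifier under assumed defaults, not the authors' interlock-forecasting pipeline.

```python
import numpy as np

def logistic_lasso(X, y, lam=0.1, lr=0.2, n_iter=1000):
    """L1-penalized logistic regression via proximal gradient (ISTA).
    X: (n, p) features; y: 0/1 labels; lam: L1 penalty strength."""
    n, p = X.shape
    w = np.zeros(p)
    for _ in range(n_iter):
        pred = 1.0 / (1.0 + np.exp(-(X @ w)))   # sigmoid probabilities
        grad = X.T @ (pred - y) / n             # logistic-loss gradient
        w = w - lr * grad
        # soft-thresholding = proximal operator of lam * ||w||_1
        w = np.sign(w) * np.maximum(np.abs(w) - lr * lam, 0.0)
    return w
```

The resulting sparse weight vector both forecasts the unstable class and indicates which monitored channels carry the predictive signal.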


DeepMed: Semiparametric Causal Mediation Analysis with Debiased Deep Learning

Xu, Siqi, Liu, Lin, Liu, Zhonghua

arXiv.org Artificial Intelligence

Causal mediation analysis can unpack the black box of causality and is therefore a powerful tool for disentangling causal pathways in biomedical and social sciences, and also for evaluating machine learning fairness. To reduce bias for estimating Natural Direct and Indirect Effects in mediation analysis, we propose a new method called DeepMed that uses deep neural networks (DNNs) to cross-fit the infinite-dimensional nuisance functions in the efficient influence functions. We obtain novel theoretical results that our DeepMed method (1) can achieve semiparametric efficiency bound without imposing sparsity constraints on the DNN architecture and (2) can adapt to certain low dimensional structures of the nuisance functions, significantly advancing the existing literature on DNN-based semiparametric causal inference. Extensive synthetic experiments are conducted to support our findings and also expose the gap between theory and practice. As a proof of concept, we apply DeepMed to analyze two real datasets on machine learning fairness and reach conclusions consistent with previous findings.
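
Cross-fitting, the device DeepMed relies on, is easy to state generically: each observation's nuisance estimate comes from a model trained only on the other folds, which protects the downstream debiased estimator from the learner's own overfitting. The sketch below shows the generic fold loop with placeholder `fit`/`predict` callables (hypothetical names; DeepMed plugs in DNNs for the nuisance functions).

```python
import numpy as np

def cross_fit_predictions(X, y, fit, predict, k=2, rng=None):
    """K-fold cross-fitting: out[i] is predicted by a model that never
    saw observation i during training."""
    rng = np.random.default_rng(0) if rng is None else rng
    n = len(y)
    folds = np.array_split(rng.permutation(n), k)
    out = np.empty(n)
    for j in range(k):
        test = folds[j]
        train = np.concatenate([folds[m] for m in range(k) if m != j])
        model = fit(X[train], y[train])          # train on other folds
        out[test] = predict(model, X[test])      # predict held-out fold
    return out
```

The cross-fitted nuisance estimates are then plugged into the efficient influence functions to form the debiased effect estimates.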


Causal Feature Selection via Orthogonal Search

Raj, Anant, Bauer, Stefan, Soleymani, Ashkan, Besserve, Michel, Schölkopf, Bernhard

arXiv.org Machine Learning

The problem of inferring the direct causal parents of a response variable among a large set of explanatory variables is of high practical importance in many disciplines. Recent work in the field of causal discovery exploits invariance properties of models across different experimental conditions for detecting direct causal links. However, these approaches generally do not scale well with the number of explanatory variables, are difficult to extend to nonlinear relationships, and require data across different experiments. Inspired by {\em Debiased} machine learning methods, we study a one-vs.-the-rest feature selection approach to discover the direct causal parent of the response. We propose an algorithm that works for purely observational data, while also offering theoretical guarantees, including the case of partially nonlinear relationships. Requiring only one estimation for each variable, we can apply our approach even to large graphs, demonstrating significant improvements compared to established approaches.
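
The debiased-machine-learning building block behind such one-vs.-the-rest searches is residual-on-residual regression: partial the remaining covariates out of both the response and the candidate cause, then regress the residuals on each other. The linear OLS sketch below illustrates the Neyman-orthogonal idea under a linear-model assumption; the function name is illustrative and the paper's method also covers partially nonlinear settings.

```python
import numpy as np

def residual_on_residual(y, d, Z):
    """Orthogonalized effect of candidate cause d on response y,
    partialling out the other covariates Z by least squares."""
    def resid(v):
        coef, *_ = np.linalg.lstsq(Z, v, rcond=None)
        return v - Z @ coef
    ry, rd = resid(y), resid(d)
    return (rd @ ry) / (rd @ rd)   # slope of ry on rd
```

Running this once per candidate variable is what makes the one-estimation-per-variable search scale to large graphs.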


Weighted Lasso Estimates for Sparse Logistic Regression: Non-asymptotic Properties with Measurement Error

Huang, Huamei, Gao, Yujing, Zhang, Huiming, Li, Bo

arXiv.org Machine Learning

When we are interested in a high-dimensional system and focus on classification performance, $\ell_{1}$-penalized logistic regression becomes important and popular. However, the Lasso estimates can be problematic when the penalties on all coefficients are identical and unrelated to the data. We propose two types of weighted Lasso estimates, with covariate-dependent weights derived from the McDiarmid inequality. Given sample size $n$ and covariate dimension $p$, the finite-sample behavior of our proposed methods with a diverging number of predictors is characterized by non-asymptotic oracle inequalities, such as bounds on the $\ell_{1}$-estimation error and the squared prediction error of the unknown parameters. We compare the performance of our methods with earlier weighted estimates on simulated data, then apply these methods to real data analysis.
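
The difference between plain and weighted Lasso shows up in the proximal operator: with per-coefficient weights, each coefficient is soft-thresholded by its own data-driven amount. The snippet below illustrates that weighted operator in isolation (the paper's McDiarmid-based weight construction is not reproduced here; the weights are placeholder inputs).

```python
import numpy as np

def weighted_soft_threshold(beta, lam, weights):
    """Proximal operator of the weighted L1 penalty
    lam * sum_j weights[j] * |beta[j]|: coefficients with larger
    weights are shrunk harder and zeroed sooner."""
    return np.sign(beta) * np.maximum(np.abs(beta) - lam * weights, 0.0)
```

Substituting this operator into a standard proximal-gradient logistic Lasso solver yields the covariate-dependent penalization the abstract describes.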


A Support Detection and Root Finding Approach for Learning High-dimensional Generalized Linear Models

Huang, Jian, Jiao, Yuling, Kang, Lican, Liu, Jin, Liu, Yanyan, Lu, Xiliang

arXiv.org Machine Learning

Feature selection is important for modeling high-dimensional data, where the number of variables can be much larger than the sample size. In this paper, we develop a support detection and root finding procedure to learn high-dimensional sparse generalized linear models, and denote this method by GSDAR. Based on the KKT conditions for $\ell_0$-penalized maximum likelihood estimation, GSDAR generates a sequence of estimators iteratively. Under some restricted invertibility conditions on the maximum likelihood function and a sparsity assumption on the target coefficients, the error of the proposed estimator decays exponentially to the optimal order. Moreover, the oracle estimator can be recovered if the target signal is stronger than the detectable level. We conduct simulations and real data analysis to illustrate the advantages of our proposed method over several existing methods, including Lasso and MCP.
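
The support-detection-and-root-finding template can be sketched for the simplest GLM, the linear model: alternate between detecting the T most promising coordinates of a gradient-corrected estimate and solving the estimating equations ("root finding") restricted to that support. This numpy sketch uses least squares in place of a general GLM likelihood and is an illustration of the template, not the GSDAR implementation.

```python
import numpy as np

def sdar_linear(X, y, T, n_iter=20):
    """Support detection + root finding for the sparse linear model.
    Each iteration: (1) rank coordinates by |beta + gradient|, keep the
    top T (support detection); (2) refit by least squares on that
    support (root finding). GSDAR does the analogue for GLM likelihoods."""
    n, p = X.shape
    beta = np.zeros(p)
    for _ in range(n_iter):
        grad = X.T @ (y - X @ beta) / n
        support = np.argsort(np.abs(beta + grad))[-T:]   # detection
        beta = np.zeros(p)
        coef, *_ = np.linalg.lstsq(X[:, support], y, rcond=None)
        beta[support] = coef                              # root finding
    return beta
```

Because each step refits exactly on the detected support, the iterates jump directly between candidate sparse solutions rather than shrinking coefficients gradually as Lasso does.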